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A novel family of dynamical Monte Carlo algorithms for lattice polymers is proposed. Our 
central idea is to simulate an extended ensemble in which the self-avoiding condition is systematically 
weakened. The degree of the self-overlap is controlled in a similar manner as the multicanonical 
ensemble. As a consequence, the ensemble — the multi-self-overlap ensemble — contains adequate 
portions of self-overlapping conformations as well as higher energy ones. It is shown that the multi- 
self-overlap ensemble algorithm reproduce correctly the canonical averages at finite temperatures 
of the HP model of lattice proteins. Moreover, it outperforms massively a standard multicanonical 
algorithm for a difficult example of a polymer with 8-stickers. Alternative algorithm based on 
exchange Monte Carlo method is also discussed. 
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Monte Carlo simulations of lattice polymers has been 
an important subject in a wide area of scientific re- 
searches, for example, statistical physics, physical chem- 
istry and theoretical biology. Simulations of lattice het- 
cropolymers, which consist of several different types of 
monomers, are especially interesting because they are 
minimal models of protein folding Jl|,||. Such simula- 
tions, however, often suffer from slow relaxations and 
metastability caused by the competition between short- 
ranged interactions and connectivity constraints among 
the monomers. For other systems with metastability, 
such as a spin system which exhibits the first-order phase 
transition, the multicanonical ensemble method is known 
to work well. || But for the heteroprolymer simulations, 
the multicanonical ensemble will not be a desired solu- 
tion. In fact, even a self-avoiding walk, which is the sim- 
plest lattice polymer, is not easy to generate [Q, although 
no interaction energy is assumed between monomers 
other than the constraint of the self-avoidingness. 

In this letter, we propose a novel approach to the dy- 
namical Monte Carlo simulation of lattice polymers. The 
present approach is a variant of Monte Carlo algorithms 
based on extended ensembles H|] |TIj and can be applied 
to a wide range of models including lattice heteropoly- 
mers and protein models on lattices. Our starting point 
is the introduction of an artificial ensemble that contains 
conformations with finite self-overlaps. With this re- 
laxation of the self-avoidingness, the conformations with 
self-overlaps play a role of "bridges" between metastable 
states and the rapid mixing of the Markov chain is ex- 
pected. In fact, Shakhnovich et al. have reported that 
the folding becomes considerably faster than usual in the 
lattice protein model that allowed doubly and triply self- 
overlapping conformations. |l2[ 

But it is not easy to make adequate portion of con- 
formations self-avoiding. Consider a dynamical simula- 
tion of self-avoiding walk. Once the self-overlap is al- 
lowed, almost all conformations in the ensemble become 
non self-avoiding, because self-overlapping conformations 
have larger entropy than self- avoiding ones. Clearly we 
need penalties to self-overlap, whose prototype is already 
seen in ]l2] |. The problem is, however, how to choose a 
penalty term with proper functional form and strength. 
Our solution to this problem is similar to that in the 
multicanonical ensemble. That is, the algorithm is de- 
signed to learn the appropriate values of the penalties 
for the violation of self-avoidingness in preliminary runs. 
After the values of the penalties are determined, a run 
for the measurement of physical quantities is performed. 
By measuring physical quantities only for self-avoiding 
conformations, we get correct canonical averages. 

The relaxation of self-avoidingness alone, however, is 
not sufficient for an ensemble of good performance when 
the models with attractive interactions are considered. 
With these models, the polymer tends to collapse at low 
temperatures, if finite self-overlaps are allowed. Such 
behavior is evidently not desirable for our purpose, be- 
cause the severely collapsed conformations have smaller 



entropy and do not work as good "bridges". Thus we 
need an ensemble extended to two directions , that is, the 
number of self-overlaps and the interaction energy. There 
are several ways to realize such an ensemble. Among 
them, we will discuss a particular type of ensemble - 
multi-self-overlap ensemble — in detail. 

Let us consider a heteropolymer of length N on a lat- 
tice G. The energy of the polymer in a conformation Y is 
denoted by E(Y). Then the canonical distribution that 
we want to sample by Monte Carlo simulations is 

P^r) cx exp(-/?£(r)). (I) 

In the conventional multicanonical algorithm |j|jL3fi we 
simulate the system with a modified probability 

P f (T) oc e X p(-/(£(r))), (2) 

where / is a function only of E. Only the self-avoiding 
conformations are allowed as Y. The form of the func- 
tion / is tuned so that the marginal distribution of E 
becomes as uniform as possible in a prescribed inter- 
val E m i n < E < E max . In actual implementations of 
the multicanonical algorithm, the appropriate values of 
f(E) are learned by the iterative construction of the en- 
ergy histogram through preliminary runs. ||||] Then the 
canonical average at inverse temperature (3 is calculated 
by the histogram reweighting. 

In the multi-self-overlap ensemble, self-overlapping 
conformations are allowed and the probability is mod- 
ified further as 

P g (Y)^exp(-g(E(Y),V(Y))). (3) 

Here V(r) is the effective energy associated with the self- 
overlaps in the conformation Y, which is defined as 

v(r) = X>(r)-i) 3 , (4) 

where n, (Y) is the number of the monomers on a lattice 
point i, and G' means a set of lattice points which are 
occupied by at least one monomer. When the polymer is 
in a self-avoiding conformation, V(r) = 0, and when it is 
collapsed into a pair of points, V(Y) ~ 2 • (N/2 — I) 2 ; 
values of V(Y) for any other conformations lie in be- 
tween these two limiting values. Here we assume that 
the definition of the original energy function E(Y) is ex- 
tended to the conformations with self-overlaps in a rea- 
sonable way. We tune the values of g(E, V) by a pre- 
liminary run so that we get a sufficiently flat bivariate 
marginal distribution of (E, V) in a prescribed range 
E mm < E < E max , < V < V max . Then a run for 
the measurement is performed. In the numerical sim- 
ulations in this paper, we use a method similar to the 
entropic sampling method. || The reweighting formula 
for the present algorithm is 

E^ fl - 1 (r i )exp(-/3E(r i )) ' 
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where I\ represents a conformation at ith Monte Carlo 
Step and the summations are taken over the self-avoiding 
conformations. 

Before discussing the numerical results, we will touch 
on related works on extended ensembles. Urakami and 
Takasu investigated lattice heteropolymers by means of 
the conventional multicanonical algorithm |l3| . Bruce 
et al. [0 calculated the free energy difference between 
crystals of different lattice structures using an extended 
ensemble. Although they treated very different systems, 
their idea had some common feature with ours. 

Now we consider a specific example as a touchstone of 
the algorighm, that is, the HP model |l4|] of protein on 
the square lattice. This model consists of a self-avoiding 
polymer chain on a lattice with two types of monomers 
("amino acids") H(hydrophobic) and P(polar). A con- 
formation of the polymer is indexed by r = {r^}, where 
n is the vector of coordinates of the ith monomer. The 
interaction u between a pair of monomers is defined as 
u(H,H) = -1, u{H,P) = u(P,H) = u{P,P) = 0. The 
energy E(r) of a conformation r of a polymer is written 
as 

E(r)=J2^S t ,S,)A( n -r 3 )^ (6) 

i<j 

where Si <E {H, P} is the index of the type of ith 
monomer. The factor A(r^ — rj) takes the value 1 if 
and only if r, and Tj are on adjoining lattice sites but 
not consecutive along the chain. We restrict the bond 
length between the consecutive monomers as 1, even for 
self-overlap conformations. Then, we can simply use the 
same definition of the energy eq. (^|) for all the confor- 
mations. We take V max = 4 throughout the following 
simulations. 

Next, we select a set of "elementary moves" used in 
the simulations. The following six moves are included, 
which are widely used in dynamical Monte Carlo sim- 
ulations of lattice polymers: jl6| (1) one-bead flip, (2) 
90° end-bond rotation, (3) 180° end-bond rotation, (4) 
±90° rotation, (5) 180° rotation, (6) rotation and reflec- 
tion. Wc further add the following two new moves which 
are possible only for self-overlapping conformations to 
the move set: (7a) "jackknife" shown in Fig. la and (8a) 
"loop overturn" shown in Fig. lb. Note that the degrees 
of freedom for the moves of the conventional types are 
increased when self-overlaps are allowed. 




FIG. 1. New elementary moves: (a) "jackknife" : If the 
monomers i — 1 and i + 1 are on the same site, the monomer 
i can be flipped into any site neighbouring to i ± 1. In the 
right figure, the monomer i is flipped to the site where the 
monomer i + 1 is already situated. (b)"loop overturn": The 
monomers i and j are on the same site. If a loop is formed 
between the monomers i and j which are on the same site, 
the loop can be overturned. 

Let us turn to numerical results. First we consider 
the sequence PH 2 P 2 H 2 P 2 H 3 P 2 HP of length 16, 0] 
which is taken from the reference. Jl5[ This sequence 
has nine distinguishable ground states which are not 
related with each other by symmetry. We calculate 
the temperature dependence of the "local field" fi = 
u{Si, Sj)A(ri — Tj) felt by the ith monomer. The re- 
sults are shown in Fig. 2 along with the data calculated 
by the exact enumeration of all the conformations. A 
good agreement to the exact result supports the validity 
of the present algorithm. 
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the heteropolymer of the length 16. Four typical local fields 
among 16 are shown. The symbols are the data calculated by 
the multi-self-overlap ensemble, and the line is the exact result 
calculated by the exact enumeration of all the conformations. 

Next we consider the "8-sticker" se- 
quence P 3 (HP 6 ) 7 HP 3 of length 56. g| The polymer 
with this sequence has a number of ground states. They 
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can be classified into three large groups by the combi- 
nations of contacting H monomers Simulations of 
such polymer are evidently far from trivial. We com- 
pare the performance of the multi-self-overlap algorithm 
with the conventional multicanonical algorithm for this 
example. Instead of the new elementary moves (7a) and 
(7b), we use the following two moves in the conventional 
multicanonical algorithm: (7b) 180° crankshaft and (8b) 
three-bead J flip. 

Time-series of the energy in both algorithms are plot- 
ted in Fig. 3. The definition of one Monte Carlo step 
is that all types of the moves are tried once for all the 
monomers, irrespective of whether the resulting confor- 
mation satisfies the self-avoiding condition or not. For 
the multi-self-overlap algorithm, the values of the en- 
ergy only of self-avoiding conformations are shown. It 
is clearly seen that inclusion of the self-overlapping con- 
formation accelerates the up-down itinerancy of energy 
very much. In Fig. 4, visit to each type of the ground 
states are recorded. Jumps between the ground states of 
different types are evidently more frequent in the multi- 
self-overlap simulation than in the conventional multi- 
canonical simulation. It suggests that the relaxation is 
much faster in the multi-self-overlap algorithm. We also 
make similar calculations by the multi-self-overlap algo- 
rithm without using a new global move (8a). As a result, 
we find that the performance slightly lowers than before 
but still is much higher than that of the conventional 
algorithm. 
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FIG. 3. Time series of the energy for the 8-sticker poly- 
mer of the length 56. The upper figure is for the con- 
ventional multicanonical ensemble, and the lower one is for 
the multi-self-overlap ensemble. Only the energy of the 
self-avoiding conformations are plotted in the lower figure; for 
counting the Monte Carlo steps, however, the self-overlapping 
conformations are also taken into account. 
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FIG. 4. Itinerancy among the ground states of three types 
against the Monte Carlo steps. The upper figure is for the 
conventional multicanonical ensemble, and the lower one is 
for the multi-self-overlap ensemble. The ordinate indicates 
the type of the ground states. 

So far, we discuss the performance in the measure- 
ment runs. On the other hand, our experience shows 
that the multi-self-overlap algorithm has better perfor- 
mance even in the preliminary runs, i.e., the tuning of 
the weight requires much shorter CPU time than in the 
conventional multicanonical algorithm, despite the two- 
dimensional nature of the histogram to be constructed in 
the preliminary runs. Actually, we need a few iterations 
each of which consists only of 50000 Monte Carlo steps 
to obtain sufficiantly uniform histogram by the multi- 
self-overlap algorithm. On the other hand, at least 10 6 
Monte Carlo steps are needed in each iteration by the 
conventional multicanonical algorithm. 

Finally, we briefly discuss a different implementation 
of our idea. Here we construct an extended ensemble fol- 
lowing a method similar to the exchange Monte Carlo al- 
gorithm |Io|]g[l (equivalently, Metropolis-coupled Markov 
chain Monte Carlo algorithm or, Time-homogeneous 
annealing algorithm (s)) . Instead of a series of canon- 
ical ensembles with different temperatures in the usual 
exchange algorithm |icf| , we introduce a series of n mod- 
ified distributions {P k (T k )} (1 < k <n), where 



P k (T k ) oc eM-PkE(T k ) - X k V{T k ) ) 



(7) 



with different degrees of penalties {X k } to self-overlap 
and inverse temperatures {/?&}• The value of A„ is set 
large enough to prevent self-overlap and [3 n as the inverse 
temperature [3 originally required in the problem. The 
exchange algorithm is defined as follows: n dynamical 
Monte Carlo simulations are performed in parallel, each 
for fcth parameter set {X k ,f3 k ), with a pair of the con- 
formations T k and T k > being exchanged at a prescribed 
interval according to a probability 
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max < 1 



Pk(T k >)-Pk>(Tk) 

p k {T k )-p k ,{r k ,) 



(8) 



The equilibrium distribution of this coupled Markov 
chain is the simultaneous distribution HkPk- Then, a 
sample from the distribution P n is regarded as a sample 
from the desired distribution. 

In this paper, we proposed a novel family of dynami- 
cal Monte Carlo algorithms for the simulation of lattice 
polymers. The essence of our algorithms is the intro- 
duction of extended ensembles in which the self-avoiding 
conditions are systematically weakened. We discussed 
two different implementations of the idea. Both ensem- 
bles contains an adequate portion of higher energy con- 
formations as well as self-overlapping conformations. We 
perform the numerical simulations using one of them, the 
multi-self-overlap ensemble algorithm. It achieved supe- 
rior performance compared with the conventional mul- 
ticanonical algorithm in the hard problem of 8-sticker 
HP polymers. The following point should be stressed: 
Although only a portion of the generated conformations 
satisfy the self-avoiding condition (about 1/5 in the simu- 
lations presented above, since V max = 4) , we still can get 
much more statistically independent samples compared 
with the conventional multicanonical ensemble, because 
the relaxation is accerelated very much. We also note 
that the proposed algorithms correctly reproduce the 
canonical averages at finite temperature when the aver- 
ages are taken over self-avoiding conformations. 

We believe that the present algorithm is powerful and 
general tool to investigate lattice heteropolymers. An in- 
teresting application will be in the design of lattice pro- 
teins Pq| . Research in this direction is now in progress 
and will be published in the forthcoming paper |l£| ]. Fi- 
nally, we point out that the idea behind the multi-self- 
overlap ensemble is easily extended to simulations of off- 
lattice polymer models. Application to the finite tem- 
perature simulations of the realistic protein models will 
also be an interesting problem, where the conventional 
multicanonical ensemble method is currently used. EG] 
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